As in the presentation, we will use the synthetic data set based on the Public Use File (PUF) of the GESIS Panel Special Survey on the Coronavirus SARS-CoV-2 Outbreak in Germany for the following exercises. As a reminder: This data set should be stored in the data subfolder of the folder with the course materials. For the exercises in this session and the following one, it is also helpful to consult the codebook for the original data set.
In the following exercises, we will perform some basic data wrangling tasks.
As always, before we can do that, we need to load the tidyverse packages and import the data.
library(tidyverse)
gpc <- read_csv("../data/ZA5667_v1-0-0_Stata14_synthetic-data.csv")
Create a new object named gpc_info that only contains the (binary) variables that asked about the use of different sources of information about the coronavirus. To find the right names, you can check the codebook (search for “media consumption”) or have a look at the clue for this task.
Clue: The first variable we need is hzcy084a, and the last one is hzcy095a. They appear consecutively in the data set.
gpc_info <- gpc %>%
  select(hzcy084a:hzcy095a)
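To see how the `first:last` range works in select(), here is a minimal sketch on a small made-up tibble. The object `toy` and its column names are assumptions for illustration only and are not part of the GESIS data.

```r
library(dplyr)

# Toy data standing in for gpc; all names here are hypothetical
toy <- tibble(
  id     = 1:3,
  q1     = c(1, 0, 1),
  q2     = c(0, 0, 1),
  q3     = c(1, 1, 0),
  weight = c(0.8, 1.1, 1.0)
)

# first:last keeps both endpoints and every column positioned between them
toy_q <- toy %>%
  select(q1:q3)

names(toy_q)
# "q1" "q2" "q3"
```

The range depends on the position of the columns in the data set, not on their names, which is why it matters that the variables appear consecutively.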
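Rename the variables to the following names (in the order in which the variables appear in the gpc_info object): info_nat_pub_br, info_nat_pr_br, info_nat_np, info_loc_pub_br, info_loc_pr_br, info_loc_np, info_fb, info_other_sm, info_personal, info_other, info_none.
Clue: The syntax for renaming with the relevant dplyr function is new_name = old_name.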
gpc_info <- gpc_info %>%
  rename(info_nat_pub_br = hzcy084a,
         info_nat_pr_br = hzcy085a,
         info_nat_np = hzcy086a,
         info_loc_pub_br = hzcy087a,
         info_loc_pr_br = hzcy088a,
         info_loc_np = hzcy089a,
         info_fb = hzcy090a,
         info_other_sm = hzcy091a,
         info_personal = hzcy092a,
         info_other = hzcy093a,
         info_none = hzcy095a)
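The new_name = old_name pattern can be tried out on a tiny example first. This is only a sketch: the tibble `toy` and the names `x1`, `x2`, `id`, and `label` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data with uninformative column names
toy <- tibble(x1 = 1:2, x2 = c("a", "b"))

# The new name goes on the left of the equals sign, the old name on the right
toy_renamed <- toy %>%
  rename(id = x1, label = x2)

names(toy_renamed)
# "id" "label"
```

rename() only changes the names. The order of the columns and the values in them stay as they were.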
The selection and the renaming can also be done in one step with a single select() command.
gpc_info <- gpc %>%
  select(info_nat_pub_br = hzcy084a,
         info_nat_pr_br = hzcy085a,
         info_nat_np = hzcy086a,
         info_loc_pub_br = hzcy087a,
         info_loc_pr_br = hzcy088a,
         info_loc_np = hzcy089a,
         info_fb = hzcy090a,
         info_other_sm = hzcy091a,
         info_personal = hzcy092a,
         info_other = hzcy093a,
         info_none = hzcy095a)
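As a rough illustration of why one select() call can replace select() followed by rename(), the sketch below compares both routes on made-up data. The tibble `toy` and all of its column names are assumptions for illustration.

```r
library(dplyr)

# Hypothetical toy data; only q1 and q2 are of interest
toy <- tibble(id = 1:3, q1 = c(1, 0, 1), q2 = c(0, 0, 1), other = c(5, 6, 7))

# Route 1: select first, then rename
two_steps <- toy %>%
  select(q1:q2) %>%
  rename(src_tv = q1, src_web = q2)

# Route 2: select and rename in a single call
one_step <- toy %>%
  select(src_tv = q1, src_web = q2)

# Both routes give the same column names
identical(names(two_steps), names(one_step))
# TRUE
```

Unlike rename(), select() drops every column that is not listed, so the one-step version only works as intended if all variables that should be kept are named in the call.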
Create a new object named gpc_afd that only contains (simulated) respondents who report that they intend to vote in the next German federal election and that they intend to vote for the right-wing populist party AfD (Alternative fuer Deutschland).
Clue: The variables we need are intention_to_vote and choice_of_party, and the values we want to filter for are 2 (Yes) and 6 (AfD), respectively.
gpc_afd <- gpc %>%
  filter(intention_to_vote == 2,
         choice_of_party == 6)
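Conditions separated by a comma in filter() are combined with a logical AND, and rows for which a condition evaluates to NA are dropped. The following sketch shows both points on made-up data. The tibble `toy` and its columns `vote` and `party` are hypothetical stand-ins, not the real variables.

```r
library(dplyr)

# Hypothetical toy data with some missing values
toy <- tibble(
  vote  = c(2, 2, 1, NA, 2),
  party = c(6, 1, 6, 6, NA)
)

# Comma-separated conditions: a row must satisfy all of them
with_comma <- toy %>%
  filter(vote == 2, party == 6)

# The same filter written with an explicit &
with_and <- toy %>%
  filter(vote == 2 & party == 6)

nrow(with_comma)
# 1
nrow(with_and)
# 1
```

Only the first row is kept. The last two rows are removed because a comparison with NA yields NA rather than TRUE.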
Create a new object named gpc_middle_aged that only includes (simulated) respondents aged 36 to 50.
Clue: The variable we need is age_cat, and the values of that variable we are looking for are 4 to 6. You can use the helper function between() here (remember that the values you provide to this function are inclusive).
gpc_middle_aged <- gpc %>%
  filter(between(age_cat, 4, 6))
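To check what "inclusive" means for between(), it can be applied to a plain vector. This is a small sketch in which the vector `categories` is made up for illustration.

```r
library(dplyr)

# Hypothetical vector of category codes from 1 to 10
categories <- as.numeric(1:10)

# between(x, left, right) is shorthand for x >= left & x <= right
kept <- categories[between(categories, 4, 6)]
kept
# 4 5 6

# The long form returns the same values
kept_long <- categories[categories >= 4 & categories <= 6]
identical(kept, kept_long)
# TRUE
```

Both boundary values, 4 and 6, are part of the result.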
Sort the gpc data set based on the self-reported political orientation from right (high) to left (low). You do not need to save the resulting dataframe as a new object, but you should somehow check whether your code worked.
Clue: The variable we need is political_orientation, which goes from 0 (left) to 10 (right). We want to sort in descending order. To check the result, we can use the glimpse() function at the end of the pipe.
gpc %>%
  arrange(desc(political_orientation)) %>%
  glimpse()
## Rows: 3,765
## Columns: 111
## $ cohort <dbl> 2, 1, 3, 1, 1, 2, 1, 1, 1, 2, 3, 3, 1, 3, 3, 1, ~
## $ sex <dbl> 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, ~
## $ age_cat <dbl> 7, 10, 7, 7, 10, 7, 6, 10, 10, 7, 1, 8, 5, 10, 1~
## $ education_cat <dbl> 1, 2, 1, 1, 2, 2, 3, 2, 2, 1, 3, 1, 3, 2, 2, 3, ~
## $ intention_to_vote <dbl> 2, 2, 2, 2, 2, NA, 1, 2, 2, 2, 97, 2, 2, 2, 2, 2~
## $ choice_of_party <dbl> 6, 6, 7, 7, 6, NA, 98, 6, 6, 6, 97, 1, 6, 6, 6, ~
## $ political_orientation <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, ~
## $ marstat <dbl> 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 2, 1, 3, 3, 1, 1, ~
## $ household <dbl> 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 1, 2, 3, ~
## $ hzcy001a <dbl> 4, 5, NA, 1, 4, 3, 1, 3, 5, 2, NA, 4, 5, 1, 1, 5~
## $ hzcy002a <dbl> 4, 5, NA, 1, 4, 2, 1, 4, 4, 3, NA, 4, 5, 1, 1, 6~
## $ hzcy003a <dbl> 4, 5, NA, 1, 5, 1, 1, 2, 5, 1, NA, 3, 4, 1, 1, 3~
## $ hzcy004a <dbl> 3, 5, NA, 4, 97, 4, 1, 3, 6, 4, NA, 4, 6, 1, 1, ~
## $ hzcy005a <dbl> 4, 4, NA, 1, 4, 3, 1, 3, 4, 2, NA, 3, 5, 1, 1, 5~
## $ hzcy006a <dbl> 1, 1, NA, 1, 1, 1, 1, 0, 1, 0, NA, 1, 1, 0, 1, 1~
## $ hzcy007a <dbl> 1, 1, NA, 1, 1, 1, 1, 0, 1, 0, NA, 1, 1, 0, 1, 1~
## $ hzcy008a <dbl> 0, 0, NA, 0, 1, 1, 1, 0, 0, 0, NA, 0, 1, 0, 0, 0~
## $ hzcy009a <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 1, 0, 0, 0~
## $ hzcy010a <dbl> 0, 1, NA, 0, 1, 0, 0, 0, 1, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy011a <dbl> 1, 1, NA, 1, 1, 1, 1, 1, 1, 0, NA, 1, 1, 0, 1, 1~
## $ hzcy012a <dbl> 0, 1, NA, 1, 1, 0, 0, 1, 1, 0, NA, 1, 0, 0, 1, 0~
## $ hzcy013a <dbl> 0, 1, NA, 1, 1, 0, 1, 1, 1, 0, NA, 0, 0, 0, 1, 0~
## $ hzcy014a <dbl> 1, 1, NA, 1, 1, 1, 1, 1, 1, 0, NA, 1, 1, 0, 1, 0~
## $ hzcy015a <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy016a <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy018a <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 1, NA, 0, 0, 1, 0, 0~
## $ hzcy019a <dbl> 4, 5, NA, 3, 5, 3, 5, 5, 4, 3, NA, 2, 3, 4, 4, 4~
## $ hzcy020a <dbl> 5, 5, NA, 4, 5, 5, NA, 4, 4, 3, NA, 3, 5, 4, 4, ~
## $ hzcy021a <dbl> 5, 5, NA, 5, 5, 5, 4, 4, 4, 2, NA, 4, 5, 1, 4, 4~
## $ hzcy022a <dbl> 4, 5, NA, 4, 5, NA, 4, 3, 4, 2, NA, 4, 5, 5, 4, ~
## $ hzcy023a <dbl> 5, 5, NA, 4, 5, 5, 5, 3, 5, 3, NA, 3, 5, 5, 4, 4~
## $ hzcy024a <dbl> 5, 5, NA, 4, 5, 4, 4, 3, 5, 1, NA, 3, 5, 5, 2, 3~
## $ hzcy025a <dbl> 3, 5, NA, 4, 5, 2, 4, 2, 5, 1, NA, 3, 4, 5, 1, 4~
## $ hzcy026a <dbl> 1, 1, NA, 1, 1, 2, 2, 1, 1, 1, NA, 1, 1, 1, 2, 1~
## $ hzcy027a <dbl> 5, 4, NA, 5, 5, NA, NA, 4, 5, 5, NA, 3, 5, 4, NA~
## $ hzcy028a <dbl> 3, 1, NA, 1, 3, NA, NA, 3, 4, 3, NA, 1, 3, 4, NA~
## $ hzcy029a <dbl> 5, 5, NA, 5, 5, NA, NA, 1, 5, 3, NA, 5, 5, 5, NA~
## $ hzcy030a <dbl> 5, 5, NA, 3, 5, NA, NA, 4, 5, 3, NA, 5, 5, 5, NA~
## $ hzcy031a <dbl> 5, 5, NA, 3, 4, NA, NA, 4, 5, 3, NA, 2, 5, 4, NA~
## $ hzcy032a <dbl> 5, 5, NA, 3, 4, NA, NA, 5, 5, 3, NA, 3, 5, 5, NA~
## $ hzcy033a <dbl> NA, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA, NA, NA~
## $ hzcy034a <dbl> NA, NA, NA, NA, NA, 4, 4, NA, NA, NA, NA, NA, NA~
## $ hzcy035a <dbl> NA, NA, NA, NA, NA, 3, 5, NA, NA, NA, NA, NA, NA~
## $ hzcy036a <dbl> NA, NA, NA, NA, NA, 3, 3, NA, NA, NA, NA, NA, NA~
## $ hzcy037a <dbl> NA, NA, NA, NA, NA, 1, 5, NA, NA, NA, NA, NA, NA~
## $ hzcy038a <dbl> NA, NA, NA, NA, NA, 2, 2, NA, NA, NA, NA, NA, NA~
## $ hzcy039a <dbl> NA, NA, NA, NA, NA, 3, 4, NA, NA, NA, NA, NA, NA~
## $ hzcy040a <dbl> 3, 3, NA, 3, 3, 3, 2, 2, 3, 2, NA, 2, 3, 4, 2, 2~
## $ hzcy041a <dbl> 3, 3, NA, 2, 3, 3, 3, 4, 3, 5, NA, 2, 3, 4, 2, 3~
## $ hzcy042a <dbl> 3, 4, NA, 3, 3, 4, 1, 2, 3, 4, NA, 3, 3, 3, 3, 1~
## $ hzcy043a <dbl> 3, 4, NA, 3, 3, 4, 2, 3, 3, 5, NA, 2, 3, 3, 3, 2~
## $ hzcy044a <dbl> 3, 4, NA, 5, 5, 4, 4, 98, 4, 5, NA, 5, 5, 5, 4, ~
## $ hzcy045a <dbl> 3, 4, NA, 3, 5, 5, 5, 3, 3, 3, NA, 5, 5, 4, 3, 3~
## $ hzcy046a <dbl> 3, 4, NA, 4, 5, 4, 4, 3, 3, 4, NA, 5, 98, 4, 3, ~
## $ hzcy047a <dbl> 4, 4, NA, 3, 5, 5, 4, 5, 5, 4, NA, 5, 4, 98, 1, ~
## $ hzcy048a <dbl> 4, 2, NA, 3, 5, 5, 4, 3, 3, 2, NA, 4, 5, 4, 2, 2~
## $ hzcy049a <dbl> 4, 2, NA, 3, 5, 2, 1, 2, 3, 2, NA, 4, 5, 2, 3, 1~
## $ hzcy050a <dbl> 4, 4, NA, 4, 5, 5, 4, 3, 3, 3, NA, 5, 5, 1, 2, 2~
## $ hzcy051a <dbl> 4, 3, NA, NA, 98, 5, 5, NA, 3, 4, NA, 5, 5, 4, 2~
## $ hzcy052a <dbl> 4, 4, NA, 4, 98, 5, 5, 4, 4, 4, NA, 5, 5, 4, 3, ~
## $ hzcy053a <dbl> 7, 5, NA, 7, 5, 7, 1, 5, 5, 5, NA, 5, 1, 5, 8, 1~
## $ hzcy054a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 1~
## $ hzcy055a <dbl> NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, 0~
## $ hzcy056a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 0~
## $ hzcy057a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 0~
## $ hzcy058a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 0~
## $ hzcy059a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 0~
## $ hzcy060a <dbl> NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 0~
## $ hzcy061a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy062a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy063a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy064a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy065a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy066a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy067a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy068a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy069a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy070a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy071a <dbl> 2, 2, NA, 2, 2, 2, 2, 2, 2, 2, NA, 2, 1, 2, 2, 1~
## $ hzcy072a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy073a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy074a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy075a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy076a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy077a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy078a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy079a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy080a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy081a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy083a <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ hzcy084a <dbl> 1, 1, NA, 1, 1, 1, 0, 1, 1, 1, NA, 1, 0, 0, 1, 0~
## $ hzcy085a <dbl> 1, 0, NA, 0, 1, 1, 0, 1, 1, 1, NA, 1, 1, 1, 0, 0~
## $ hzcy086a <dbl> 0, 1, NA, 0, 1, 0, 0, 0, 0, 0, NA, 0, 0, 1, 0, 0~
## $ hzcy087a <dbl> 0, 1, NA, 1, 1, 1, 0, 0, 1, 1, NA, 1, 0, 0, 0, 0~
## $ hzcy088a <dbl> 0, 0, NA, 0, 0, 1, 1, 1, 0, 1, NA, 0, 0, 1, 0, 0~
## $ hzcy089a <dbl> 1, 1, NA, 0, 0, 1, 1, 0, 1, 1, NA, 1, 1, 1, 0, 0~
## $ hzcy090a <dbl> 0, 0, NA, 1, 1, 0, 0, 1, 1, 0, NA, 0, 0, 1, 0, 0~
## $ hzcy091a <dbl> 0, 0, NA, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy092a <dbl> 0, 0, NA, 1, 1, 1, 1, 0, 1, 1, NA, 1, 1, 1, 1, 0~
## $ hzcy093a <dbl> 0, 0, NA, 0, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy095a <dbl> 0, 0, NA, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, 0, 0~
## $ hzcy096a <dbl> NA, NA, NA, 1, 1, NA, NA, 1, 1, NA, NA, NA, NA, ~
## $ hzcy097a <dbl> NA, NA, NA, 1, 0, NA, NA, 0, 0, NA, NA, NA, NA, ~
## $ hzcy098a <dbl> NA, NA, NA, 0, 0, NA, NA, 0, 0, NA, NA, NA, NA, ~
## $ hzcy099a <dbl> NA, NA, NA, 0, 1, NA, NA, 1, 1, NA, NA, NA, NA, ~
## $ hzza003a <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, ~
## $ hzzq009a <dbl> 4, 5, NA, 4, 5, 4, 4, 4, NA, 5, NA, 5, 4, 4, 3, ~
## $ hzzq023a <dbl> 4, 5, NA, 4, 5, 5, 5, 4, 4, 5, NA, 5, 4, 5, 3, 4~
## $ hzzp201a <dbl> 31, 32, NA, 32, 31, 31, 31, 31, 31, 32, NA, 31, ~
## $ hzzp204a <dbl> 453, NA, NA, NA, 252, 699, 1280, 397, 413, NA, N~
## $ hzzp207a <dbl> 1584692399, 1584725448, NA, 1584786274, 15846183~
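Because glimpse() prints every column, the sorted variable is easy to miss in the output. One possible alternative, sketched here on made-up data, is to look only at the sorted column and to test the order programmatically. The tibble `toy` and its column `orientation` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical toy data with one missing value
toy <- tibble(id = 1:5, orientation = c(3, 10, NA, 0, 7))

sorted <- toy %>%
  arrange(desc(orientation))

# Inspect only the column that was sorted; arrange() puts NA at the end
sorted$orientation
# 10 7 3 0 NA

# Programmatic check: ignoring NA, no value is larger than the one before it
vals <- sorted$orientation[!is.na(sorted$orientation)]
all(diff(vals) <= 0)
# TRUE
```

The same idea could be applied to the real data by replacing `toy` with gpc and `orientation` with political_orientation.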